CleanEST: a database of cleansed EST libraries

نویسندگان

  • Byungwook Lee
  • Gwangsik Shin
چکیده

The EST division of GenBank, dbEST, is widely used in many applications such as gene discovery and verification of exon-intron structure. However, the use of EST sequences in the dbEST libraries is often hampered by inconsistent terminology used to describe the library sources and by the presence of contaminated sequences. Here, we describe CleanEST, a novel database server that classified dbEST libraries and removes contaminants. We classified all dbEST libraries according to species and sequencing center. In addition, we further classified human EST libraries by anatomical and pathological systems according to eVOC ontologies. For each dbEST library, we provide two different cleansed sequences: 'pre-cleansed' and 'user-cleansed'. To generate pre-cleansed sequences, we cleansed sequences in dbEST by alignment of EST sequences against well-known contamination sources: UniVec, Escherichia coli, mitochondria and chloroplast (for plant). To provide user-cleansed sequences, we built an automatic user-cleansing pipeline, in which sequences of a user-selected library are cleansed on-the-fly according to user-selected options. The server is available at http://cleanest.kobic.re.kr/ and the database is updated monthly.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences

We present a web-based server, called ESTpass, for processing and annotating sequence data from expressed sequence tag (EST) projects. ESTpass accepts a FASTA-formatted EST file and its quality file as inputs, and it then executes a back-end EST analysis pipeline consisting of three consecutive steps. The first is cleansing the input EST sequences. The second is clustering and assembling the cl...

متن کامل

ESTAP-an automated system for the analysis of EST data

The EST Analysis Pipeline (ESTAP) is a set of analytical procedures that automatically verify, cleanse, store and analyze ESTs generated on high-throughput platforms. It uses a relational database to store sequence data and analysis results, which facilitates both the search for specific information and statistical analysis. ESTAP provides for easy viewing of the original and cleansed data, as ...

متن کامل

EHPnet: Doctors and the Environment

We present a web-based server, called ESTpass, for processing and annotating sequence data from expressed sequence tag (EST) projects. ESTpass accepts a FASTA-formatted EST file and its quality file as inputs, and it then executes a back-end EST analysis pipeline consisting of three consecutive steps. The first is cleansing the input EST sequences. The second is clustering and assembling the cl...

متن کامل

P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis

Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and these sequences are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, was prepared from one particular cell type, organ, or tumor. Therefore, th...

متن کامل

Update of the Diatom EST Database: a new tool for digital transcriptomics

The Diatom Expressed Sequence Tag (EST) Database was constructed to provide integral access to ESTs from these ecologically and evolutionarily interesting microalgae. It has now been updated with 130,000 Phaeodactylum tricornutum ESTs from 16 cDNA libraries and 77,000 Thalassiosira pseudonana ESTs from seven libraries, derived from cells grown in different nutrient and stress regimes. The updat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 37  شماره 

صفحات  -

تاریخ انتشار 2009